Index-CloseMiner: An improved algorithm for mining frequent closed itemset

Authors

  • Wei Song
  • Bingru Yang
  • Zhangyan Xu
Abstract

The set of frequent closed itemsets determines exactly the complete set of all frequent itemsets and is usually much smaller than the latter. This paper proposes an improved algorithm for mining frequent closed itemsets. First, the index array is proposed, which is used to discover items that always appear together. Second, an algorithm that uses bitmaps to compute the index array is presented. Third, based on the heuristic information provided by the index array, frequent items that co-occur and share the same support are merged, which yields the initial generators. Finally, based on the index array, a reduced pre-set and a reduced post-set are proposed. It is proved that the reduced pre-set and reduced post-set retain the function of the pre-set and post-set while being smaller. The redundant items in the pre-set and post-set are therefore deleted, which saves much of the work of inclusion checking. The experimental results show that the proposed algorithm is efficient, especially on dense datasets.
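
The index array step can be pictured with the minimal sketch below. It assumes, as one reading that the abstract does not spell out, that the index array entry for an item holds every item appearing in all transactions that contain it. The function name `build_index_array` and the toy dataset are hypothetical. Each transaction is encoded as an integer bitmap, and an item's entry is the bitwise AND of the bitmaps of the transactions containing it.

```python
from functools import reduce

def build_index_array(transactions, items):
    """Map each item to the set of items that co-occur with it in every
    transaction containing it (an assumed reading of the 'index array')."""
    pos = {item: k for k, item in enumerate(items)}
    # Horizontal bitmaps: one integer per transaction, bit k set if items[k] is present.
    rows = [sum(1 << pos[i] for i in set(t)) for t in transactions]
    index = {}
    for item in items:
        bit = 1 << pos[item]
        containing = [r for r in rows if r & bit]
        if not containing:
            index[item] = set()  # item never occurs
            continue
        # Bits that survive the AND belong to items present in every such transaction.
        common = reduce(lambda a, b: a & b, containing)
        index[item] = {j for j in items if common & (1 << pos[j])}
    return index

# Hypothetical toy dataset, for illustration only.
transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "b", "d"]]
index_array = build_index_array(transactions, ["a", "b", "c", "d"])
```

Here `index_array["a"]` is `{"a", "b"}`, because `b` occurs in every transaction that contains `a`. When two items appear in each other's company in this sense and also have the same support, they occur in exactly the same transactions, which is the situation the abstract describes as merging.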

Similar resources

The Algorithm of Mining Frequent Closed Itemsets Based on Index Array

The set of frequent closed itemsets determines exactly the complete set of all frequent itemsets and is usually much smaller than the latter. In this paper, Index-FCI, an algorithm based on an index array for mining frequent closed itemsets, is proposed. The vertical BitTable is adopted to compress the dataset for fast support counting. To make use of the horizontal BitTable, the index array co...
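
Support counting over a vertical bitmap layout can be sketched as follows. This is a generic illustration, not the BitTable structure of the paper, whose details are not given here. The names `vertical_bitmaps` and `support` are hypothetical. Each item gets one integer whose bit t is set when transaction t contains the item, so the support of an itemset is the number of set bits in the AND of its items' bitmaps.

```python
def vertical_bitmaps(transactions):
    """One integer per item; bit t is set if transaction t contains the item."""
    columns = {}
    for t, transaction in enumerate(transactions):
        for item in set(transaction):
            columns[item] = columns.get(item, 0) | (1 << t)
    return columns

def support(itemset, columns, n_transactions):
    """Count the transactions that contain every item of `itemset`."""
    mask = (1 << n_transactions) - 1  # the empty itemset matches all transactions
    for item in itemset:
        mask &= columns.get(item, 0)  # unknown items match no transaction
    return bin(mask).count("1")

# Hypothetical toy dataset, for illustration only.
transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "b", "d"]]
columns = vertical_bitmaps(transactions)
```

With this data, `support(["a", "b"], columns, 4)` counts three transactions and `support(["c", "d"], columns, 4)` counts none.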

Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions

The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...

An Efficient Incremental Algorithm to Mine Closed Frequent Itemsets over Data Streams

The purpose of this work is to mine closed frequent itemsets from transactional data streams using a sliding window model. An efficient algorithm, IMCFI, is proposed for Incremental Mining of Closed Frequent Itemsets from a transactional data stream. The proposed algorithm IMCFI uses a data structure called INdexed Tree (INT), similar to NewCET used in NewMoment [5]. INT contains an index table Item...

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets

For a transaction database, a frequent itemset is an itemset included in at least a specified number of transactions. A frequent itemset P is maximal if P is included in no other frequent itemset, and closed if P is included in no other itemset included in exactly the same transactions as P. The problems of finding these frequent itemsets are fundamental in data mining, and from the applicatio...
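
The three definitions above can be checked directly on a small example. The sketch below is a brute-force illustration of the definitions only, not the LCM algorithm. The function names and the toy dataset are hypothetical, and the enumeration is exponential in the number of items, so it is suitable only for tiny inputs.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute force: map each frequent itemset (a frozenset) to its support."""
    rows = [frozenset(t) for t in transactions]
    items = sorted(set().union(*rows))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for r in rows if candidate <= r)
            if count >= min_support:
                result[candidate] = count
    return result

def closed_and_maximal(frequent):
    """Split frequent itemsets into closed and maximal ones by definition."""
    closed, maximal = set(), set()
    for p, sup in frequent.items():
        supersets = [q for q in frequent if p < q]  # proper frequent supersets
        if not supersets:
            maximal.add(p)
        # A superset with equal support would itself be frequent, so it is
        # enough to compare against frequent supersets.
        if all(frequent[q] < sup for q in supersets):
            closed.add(p)
    return closed, maximal

# Hypothetical toy dataset, for illustration only.
transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "b", "d"]]
frequent = frequent_itemsets(transactions, min_support=2)
closed, maximal = closed_and_maximal(frequent)
```

With a minimum support of 2, the frequent itemsets are {a}, {b}, {c}, {a, b} and {b, c}. Of these, {b}, {a, b} and {b, c} are closed, and {a, b} and {b, c} are maximal. {a} is not closed because {a, b} occurs in exactly the same transactions.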

Journal title:
  • Intell. Data Anal.

Volume 12  Issue 

Pages  -

Publication date 2008